Abstract: Regenerating codes provide an efficient way to 
recover data at failed nodes in distributed storage systems. It 
has been shown that regenerating codes can be designed to 
minimize the per-node storage (called MSR) or minimize the 
communication overhead for regeneration (called MBR). In this 
work, we propose a new encoding scheme for [n, d] error-correcting
MSR codes that generalizes our earlier work on
error-correcting regenerating codes. We show that by choosing
a suitable diagonal matrix, any generator matrix of the [n, α]
Reed-Solomon (RS) code can be integrated into the encoding
matrix. Hence, MSR codes with the least update complexity can
be found. An efficient decoding scheme is also proposed that
utilizes the [n, α] RS code to perform data reconstruction. The
proposed decoding scheme has better error correction capability 
and incurs the least number of node accesses when errors are 
present. 

I. Introduction 

Cloud storage is gaining popularity as an alternative to 
enterprise storage where data is stored in virtualized pools of 
storage typically hosted by third-party data centers. Reliability 
is a key challenge in the design of distributed storage systems 
that provide cloud storage. Both crash-stop and Byzantine 
failures (as a result of software bugs and malicious attacks) 
are likely to be present during data retrieval. A crash-stop 
failure makes a storage node unresponsive to access requests. 
In contrast, a Byzantine failure responds to access requests 
with erroneous data. To achieve better reliability, one common 
approach is to replicate data files on multiple storage nodes in 
a network. Erasure coding is employed to encode the original 
data and then the encoded data is distributed to storage nodes. 
Typically, more than one storage node needs to be accessed to 
recover the original data. One popular class of erasure codes 
is the maximum-distance-separable (MDS) codes. With [n, k] 
MDS codes such as Reed-Solomon (RS) codes, k data items 
are encoded and then distributed to and stored at n storage 
nodes. A user or a data collector can retrieve the original data 
by accessing any k of the storage nodes, a process referred to 
as data reconstruction. 

Any storage node can fail due to hardware or software
damage. Data stored at the failed nodes needs to be recovered
(regenerated) so that the system remains able to perform data
reconstruction. The process of recovering the stored (encoded) data
at a storage node is called data regeneration. Regenerating
codes, first introduced in the pioneering works by Dimakis et
al. in [1], [2], allow efficient data regeneration. To facilitate
data regeneration, each storage node stores α symbols and a
total of d surviving nodes are accessed to retrieve β ≤ α
symbols from each node. A trade-off exists between the storage
overhead and the regeneration (repair) bandwidth needed
for data regeneration. Minimum Storage Regenerating (MSR)
codes first minimize the amount of data stored per node,
and then the repair bandwidth, while Minimum Bandwidth
Regenerating (MBR) codes carry out the minimization in the
reverse order. There have been many works that focus on the
design of regenerating codes [3]-[10]. Recently, Rashmi et
al. proposed optimal exact-regenerating codes that recover the
stored data at the failed node exactly (hence the name
exact-regenerating) [10]; however, the authors only consider
crash-stop failures of storage nodes. Han et al. extended Rashmi's
work to construct error-correcting regenerating codes for exact
regeneration that can handle Byzantine failures [11]. In [11],
the encoding and decoding algorithms for both MSR and MBR
error-correcting codes were also provided. In [12], the code
capability and resilience were discussed for error-correcting
regenerating codes.

In addition to bandwidth efficiency and error correction 
capability, another desirable feature for regenerating codes 
is update complexity [13], defined as the maximum number
of encoded symbols that must be updated when a single
data symbol is modified. Low update complexity is desirable
in scenarios where updates are frequent. Clearly, the update 
complexity of a regenerating code is determined by the number 
of non-zero elements in the row of the encoding matrix with 
the maximum Hamming weight. The smaller the number, the 
lower the update complexity is. 
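
As a minimal sketch of this definition, the following Python function reads the update complexity off an encoding matrix given as a list of rows. The function name and the two toy matrices are illustrative assumptions and do not come from any of the cited constructions.

```python
def update_complexity(encoding_matrix):
    """Maximum Hamming weight over the rows of the encoding matrix.

    For a linear code in which the data multiplies the matrix from the
    left, the nonzero entries of a row mark the encoded symbols that
    change when the data symbol attached to that row is modified.
    """
    return max(sum(1 for entry in row if entry != 0) for row in encoding_matrix)


# Toy matrices (hypothetical values): a dense one and a sparser one.
dense = [[1, 1, 1, 1], [1, 2, 3, 4]]
sparse = [[3, 5, 1, 0], [6, 7, 0, 1]]
print(update_complexity(dense), update_complexity(sparse))  # 4 3
```

The sparser matrix has the lower update complexity because its heaviest row touches three encoded symbols instead of four.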

One drawback of the decoding algorithms for MSR codes
given in [11] is that, when one or more storage nodes have
erroneous data, the decoder needs to access extra data from
many storage nodes (at least k more nodes) for data reconstruction.
Furthermore, when one symbol in the original data is
updated, all storage nodes need to update their respective data.
Thus, the MSR and MBR codes in [11] have the maximum
possible update complexity. Both deficiencies are addressed in
this paper. First, we propose a general encoding scheme for
MSR codes. As a special case, least-update-complexity codes
are designed. Second, a new decoding algorithm is presented.
It not only provides better error correction capability but also
incurs low communication overhead when errors occur in the
accessed data.

II. ERROR-CORRECTING MSR REGENERATING CODES 

In this section, we give a brief overview of data regenerating
codes and the MSR code construction presented in [11].

A. Regenerating Codes 

Let α be the number of symbols stored at each storage
node and β ≤ α the number of symbols downloaded from
each storage node during regeneration. To repair the stored data at
the failed node, a helper node accesses d surviving nodes. The
design of regenerating codes ensures that the total regeneration
bandwidth is much less than the size of the original data, B.
A regenerating code must be capable of reconstructing the
original data symbols and regenerating coded data at a failed
node. An [n, k, d] regenerating code requires at least k and
d surviving nodes to ensure successful data reconstruction
and regeneration [10], respectively, where n is the number
of storage nodes and k ≤ d ≤ n − 1.

The cut-set bound given in [2], [5] provides a constraint on 
the repair bandwidth. By this bound, any regenerating code 
must satisfy the following inequality: 

\[ B \le \sum_{i=0}^{k-1} \min\{\alpha,\ (d-i)\beta\}. \tag{1} \]



From (1), α or β can be minimized, achieving either the minimum
storage requirement or the minimum repair bandwidth
requirement, but not both. The two extreme points in (1) are
referred to as the minimum storage regeneration (MSR) and
minimum bandwidth regeneration (MBR) points, respectively.
The values of α and β for the MSR point can be obtained by
first minimizing α and then minimizing β:



\[ \alpha = d - k + 1, \qquad B = k(d - k + 1) = k\alpha, \tag{2} \]

where we normalize β as 1.¹
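
The MSR parameters can be checked numerically. The sketch below assumes the cut-set bound in its usual form, B ≤ Σ_{i=0}^{k−1} min{α, (d − i)β}, and uses the [20, 10, 18] parameters that appear later in the simulations; the function names are illustrative.

```python
def msr_point(k, d, beta=1):
    """Return (alpha, B) at the MSR point for beta symbols per helper node."""
    alpha = (d - k + 1) * beta
    return alpha, k * alpha


def cut_set_capacity(k, d, alpha, beta):
    """Right-hand side of the cut-set bound: sum of min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


alpha, B = msr_point(k=10, d=18)                     # [n, k, d] = [20, 10, 18]
print(alpha, B, cut_set_capacity(10, 18, alpha, 1))  # 9 90 90
print(cut_set_capacity(10, 18, alpha - 1, 1))        # 80
```

With β = 1 the bound is met with equality at α = 9, and lowering α to 8 drops the right-hand side to 80 < B, so α = d − k + 1 is the smallest per-node storage that can carry B = 90 symbols.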

There are two categories of approaches to regenerate data 
at a failed node. If the replacement data is exactly the 
same as that previously stored at the failed node, we call 
it the exact regeneration. Otherwise, if the replacement data 
only guarantees the correctness of data reconstruction and 
regeneration properties, it is called functional regeneration. 
In practice, exact regeneration is more desirable since there 
is no need to inform each node in the network regarding 
the replacement. Furthermore, it is easy to keep the codes 
systematic via exact regeneration, where partial data can be 
retrieved without accessing all k nodes. The codes designed 
in [10], [11] allow exact regeneration. 

¹ It has been proved that when designing [n, k, d] MSR codes for k/(n + 1) ≤
1/2, it suffices to consider those with β = 1 [10].



B. MSR Regenerating Codes With Error Correction Capability 

Next, we describe the MSR code construction given in [11].
In the rest of the paper, we assume d = 2α. The information
sequence m = [m_0, m_1, ..., m_{B−1}] can be arranged into an
information vector U = [Z_1 Z_2] with size α × d such that
Z_1 and Z_2 are symmetric matrices with dimension α × α.
An [n, d = 2α] RS code is adopted to construct the MSR
code [11]. Let a be a generator of GF(2^m). In the encoding
of the MSR code, we have

\[ U \cdot G = C, \quad \text{where} \quad G = \begin{bmatrix} \bar{G} \\ \bar{G}\Delta \end{bmatrix} \tag{4} \]

and C is the codeword vector with dimension (α × n). Ḡ
contains the first α rows in G and Δ is a diagonal matrix with
(a^0)^α, (a^1)^α, (a^2)^α, ..., (a^{n−1})^α as diagonal elements. Note
that if the RS code is over GF(2^m) for m ≥ ⌈log_2 nα⌉, then
it can be shown that (a^0)^α, (a^1)^α, (a^2)^α, ..., (a^{n−1})^α are
all distinct. After encoding, the ith column of C is distributed
to storage node i for 1 ≤ i ≤ n.
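
The encoding step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation of [11]: it works over GF(2^3) built from the primitive polynomial x^3 + x + 1, takes Ḡ to be a Vandermonde-type matrix with entries (a^j)^i, uses the toy parameters n = 6 and α = 2 (so k = 3, d = 4, B = 6), and indexes nodes from 0 instead of 1. All helper names are hypothetical.

```python
def make_gf(m, prim_poly):
    """Exp/log tables for GF(2^m); prim_poly is the primitive polynomial as a bit mask."""
    order = (1 << m) - 1
    exp, log = [0] * (2 * order), {}
    x = 1
    for i in range(order):
        exp[i], log[x] = x, i
        x <<= 1
        if x >> m:
            x ^= prim_poly
    for i in range(order, 2 * order):
        exp[i] = exp[i - order]
    return exp, log, order


EXP, LOG, ORDER = make_gf(3, 0b1011)  # GF(8), assumed field for this toy example


def mul(x, y):
    """Field multiplication through the log/exp tables."""
    return 0 if x == 0 or y == 0 else EXP[LOG[x] + LOG[y]]


def a_pow(e):
    """a^e for the primitive element a."""
    return EXP[e % ORDER]


def symmetric(symbols, alpha):
    """Fill the upper triangle row by row and mirror it to get a symmetric matrix."""
    Z = [[0] * alpha for _ in range(alpha)]
    it = iter(symbols)
    for r in range(alpha):
        for c in range(r, alpha):
            Z[r][c] = Z[c][r] = next(it)
    return Z


def encode(message, n, alpha):
    """Return (C, G, delta) with C = U G, U = [Z1 Z2] and G = [G_bar; G_bar * Delta]."""
    half = alpha * (alpha + 1) // 2
    Z1, Z2 = symmetric(message[:half], alpha), symmetric(message[half:], alpha)
    U = [Z1[r] + Z2[r] for r in range(alpha)]
    G_bar = [[a_pow(i * j) for j in range(n)] for i in range(alpha)]  # assumed Vandermonde form
    delta = [a_pow(alpha * j) for j in range(n)]                      # diagonal of Delta
    G = G_bar + [[mul(g, lam) for g, lam in zip(row, delta)] for row in G_bar]
    C = [[0] * n for _ in range(alpha)]
    for r in range(alpha):
        for j in range(n):
            for t in range(2 * alpha):
                C[r][j] ^= mul(U[r][t], G[t][j])  # addition in GF(2^m) is XOR
    return C, G, delta


C, G, delta = encode([1, 2, 3, 4, 5, 6], n=6, alpha=2)
node_contents = [[C[r][j] for r in range(2)] for j in range(6)]  # column j goes to node j
print(node_contents)
```

Each node ends up with α = 2 symbols, and the six diagonal entries of Δ are distinct because gcd(7, 2) = 1.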

III. Encoding Schemes for Error-Correcting MSR Codes

RS codes are known to have very efficient decoding algorithms
and exhibit good error correction capability. From (4)
in Section II-B, a generator matrix G for MSR codes needs
to satisfy:

1) \( G = \begin{bmatrix} \bar{G} \\ \bar{G}\Delta \end{bmatrix} \), where Ḡ contains the first α rows in G
and Δ is a diagonal matrix with distinct elements in the
diagonal.

2) Ḡ is a generator matrix of the [n, α] RS code and G is a
generator matrix of the [n, d = 2α] RS code.

Next, we present a sufficient condition for Ḡ and Δ such that
G is a generator matrix of an [n, d] RS code.

Theorem 1: Let Ḡ be a generator matrix of the [n, α]
RS code C_α that is generated by the generator polynomial
with roots a^1, a^2, ..., a^{n−α}. Let the diagonal elements of Δ
be (a^0)^α, (a^1)^α, ..., (a^{n−1})^α, where m ≥ ⌈log_2 n⌉ and
gcd(2^m − 1, α) = 1. Then G is a generator matrix of the [n, d] RS
code C_d that is generated by the generator polynomial with
roots a^1, a^2, ..., a^{n−d}.

Proof: We need to show that each row of ḠΔ is a
codeword of C_d, and all rows in G are linearly independent.

Let c = (c_0, c_1, ..., c_{n−1}) be any row in Ḡ. Then the
polynomial representation of cΔ is

\[ \sum_{i=0}^{n-1} c_i (a^i)^{\alpha} x^i = \sum_{i=0}^{n-1} c_i (a^{\alpha} x)^i = c(a^{\alpha} x). \tag{5} \]

Since c ∈ C_α, c(x) has roots a^1, a^2, ..., a^{n−α}. Then it is easy
to see that (5) has roots a^{−α+1}, a^{−α+2}, ..., a^{n−2α} that clearly
contain a^1, a^2, ..., a^{n−2α}. Hence, cΔ ∈ C_d.

In order to show that all rows in G are linearly independent,
it is sufficient to show that cΔ ∉ C_α for all nonzero
c ∈ C_α. Assume that cΔ ∈ C_α. Then \( \sum_{i=0}^{n-1} c_i (a^{\alpha} x)^i \) must
have roots a^1, a^2, ..., a^{n−α}. It follows that c(x) must have
a^{α+1}, a^{α+2}, ..., a^n as roots. Recall that c(x) also has roots
a^1, a^2, ..., a^{n−α}. Since n − 1 ≥ d = 2α, we have n − α ≥
α + 1. Hence, c(x) has the n distinct roots a^1, a^2, ..., a^n. This
is impossible since the degree of c(x) is at most n − 1. Thus,
cΔ ∉ C_α. ∎
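
The two claims in the proof can be spot-checked numerically. The sketch below assumes GF(2^3) with primitive polynomial x^3 + x + 1 and the toy parameters n = 6, α = 2, d = 4, for which gcd(2^3 − 1, α) = 1. It takes the rows x^i g(x) as a basis of C_α and tests only those rows, not every nonzero codeword; all names are illustrative.

```python
def make_gf(m, prim_poly):
    """Exp/log tables for GF(2^m); prim_poly is the primitive polynomial as a bit mask."""
    order = (1 << m) - 1
    exp, log = [0] * (2 * order), {}
    x = 1
    for i in range(order):
        exp[i], log[x] = x, i
        x <<= 1
        if x >> m:
            x ^= prim_poly
    for i in range(order, 2 * order):
        exp[i] = exp[i - order]
    return exp, log, order


EXP, LOG, ORDER = make_gf(3, 0b1011)  # GF(8), assumed field for this toy example


def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[LOG[x] + LOG[y]]


def poly_mul(p, q):
    """Product of two polynomials; coefficients are listed lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= mul(pi, qj)
    return out


def poly_eval(p, x):
    """Horner evaluation of p at x."""
    acc = 0
    for coeff in reversed(p):
        acc = mul(acc, x) ^ coeff
    return acc


def generator_poly(num_roots):
    """g(x) with roots a^1, ..., a^num_roots (minus equals plus in GF(2^m))."""
    g = [1]
    for j in range(1, num_roots + 1):
        g = poly_mul(g, [EXP[j % ORDER], 1])
    return g


n, alpha = 6, 2
d = 2 * alpha
g = generator_poly(n - alpha)
rows = [[0] * i + g + [0] * (alpha - 1 - i) for i in range(alpha)]  # x^i g(x), length n


def times_delta(c):
    """Coefficients of c * Delta, that is c_t * (a^t)^alpha."""
    return [mul(c[t], EXP[(alpha * t) % ORDER]) for t in range(n)]


# Every row of G_bar * Delta vanishes at a^1, ..., a^(n-d), so it lies in C_d.
rows_in_Cd = all(poly_eval(times_delta(c), EXP[t]) == 0
                 for c in rows for t in range(1, n - d + 1))
# No tested row stays inside C_alpha after multiplication by Delta.
some_row_in_Calpha = any(
    all(poly_eval(times_delta(c), EXP[t]) == 0 for t in range(1, n - alpha + 1))
    for c in rows)
print(rows_in_Cd, some_row_in_Calpha)
```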

One advantage of the proposed scheme is that it can now
operate on a smaller finite field than that of the scheme in [11].
Another advantage is that one can choose Ḡ (and Δ accordingly)
freely as long as Ḡ is the generator matrix of an [n, α]
RS code. In particular, as discussed in Section I, to minimize
update complexity, it is desirable to choose a generator matrix
where the row with the maximum Hamming weight has the
least number of nonzero elements. Next, we present a
least-update-complexity generator matrix that satisfies (4).

Corollary 1: Let Δ be the one given in Theorem 1. Let Ḡ
be the generator matrix of a systematic [n, α] RS code, namely,

\[ \bar{G} = [\, D \mid I \,], \]

where the ith row of D consists of the coefficients of b_i(x),
I is the (α × α) identity matrix, and

\[ x^{n-\alpha+i} = u_i(x)\, g(x) + b_i(x) \quad \text{for } 0 \le i \le \alpha - 1. \]

Then \( G = \begin{bmatrix} \bar{G} \\ \bar{G}\Delta \end{bmatrix} \) is a least-update-complexity generator
matrix.

Proof: The result holds since each row of Ḡ is a nonzero
codeword with the minimum Hamming weight n − α + 1. ∎
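
A systematic generator matrix of this form can be built by polynomial division. The sketch below is a toy illustration that assumes GF(2^3) with primitive polynomial x^3 + x + 1, n = 6 and α = 2, and takes g(x) to be the generator polynomial with roots a^1, ..., a^{n−α}; the helper names are hypothetical. It confirms that every row of the resulting G has Hamming weight n − α + 1 = 5.

```python
def make_gf(m, prim_poly):
    """Exp/log tables for GF(2^m); prim_poly is the primitive polynomial as a bit mask."""
    order = (1 << m) - 1
    exp, log = [0] * (2 * order), {}
    x = 1
    for i in range(order):
        exp[i], log[x] = x, i
        x <<= 1
        if x >> m:
            x ^= prim_poly
    for i in range(order, 2 * order):
        exp[i] = exp[i - order]
    return exp, log, order


EXP, LOG, ORDER = make_gf(3, 0b1011)  # GF(8), assumed field for this toy example


def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[LOG[x] + LOG[y]]


def poly_eval(p, x):
    """Horner evaluation; coefficients are listed lowest degree first."""
    acc = 0
    for coeff in reversed(p):
        acc = mul(acc, x) ^ coeff
    return acc


def generator_poly(num_roots):
    """Monic g(x) with roots a^1, ..., a^num_roots."""
    g = [1]
    for j in range(1, num_roots + 1):
        root, nxt = EXP[j % ORDER], [0] * (len(g) + 1)
        for t, coeff in enumerate(g):  # multiply g(x) by (x + a^j)
            nxt[t] ^= mul(coeff, root)
            nxt[t + 1] ^= coeff
        g = nxt
    return g


def remainder(p, g):
    """p(x) mod g(x) for monic g."""
    p, dg = list(p), len(g) - 1
    for deg in range(len(p) - 1, dg - 1, -1):
        lead = p[deg]
        if lead:
            for t in range(dg + 1):
                p[deg - dg + t] ^= mul(lead, g[t])
    return p[:dg]


n, alpha = 6, 2
g = generator_poly(n - alpha)
G_bar = []
for i in range(alpha):
    b_i = remainder([0] * (n - alpha + i) + [1], g)  # x^(n-alpha+i) mod g(x)
    identity_row = [1 if t == i else 0 for t in range(alpha)]
    G_bar.append(b_i + identity_row)                 # one row of [D | I]
delta = [EXP[(alpha * j) % ORDER] for j in range(n)]
G = G_bar + [[mul(x, lam) for x, lam in zip(row, delta)] for row in G_bar]
weights = [sum(1 for x in row if x) for row in G]
print(weights)  # [5, 5, 5, 5]
```

For comparison, a Vandermonde-type generator matrix of the same size has no zero entries, so each of its rows has weight n = 6.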

IV. Efficient Decoding Scheme for Error-Correcting MSR Codes

Unlike the decoding scheme in [11] that uses the [n, d] RS code,
we propose to use a subcode of the [n, d] RS code, the
[n, α = k − 1] RS code generated by Ḡ, to perform the data
reconstruction. The advantage of using the [n, k − 1] RS code is
two-fold. First, its error correction capability is higher (namely,
it can tolerate ⌊(n − k + 1)/2⌋ instead of ⌊(n − d)/2⌋ errors). Second, it
only requires the access of two additional storage nodes (as
opposed to d − k + 2 = k nodes) for the first error to correct.



Without loss of generality, we assume that the data collector
retrieves encoded symbols from k + 2v (v ≥ 0) storage
nodes, j_0, j_1, ..., j_{k+2v−1}. We also assume that there are v
storage nodes whose received symbols are erroneous. The
stored information of the k + 2v storage nodes is collected
as the k + 2v columns in Y_{α×(k+2v)}. The k + 2v columns
of G corresponding to storage nodes j_0, j_1, ..., j_{k+2v−1} are
denoted as the columns of G_{k+2v}. First, we discuss data
reconstruction when v = 0. The decoding procedure is similar
to that in [10].

No Error: In this case, v = 0 and there is no error in Y.
Then,

\[ Y = U G_k = [Z_1\ Z_2] \begin{bmatrix} \bar{G}_k \\ \bar{G}_k \Delta \end{bmatrix} = [Z_1 \bar{G}_k + Z_2 \bar{G}_k \Delta]. \tag{7} \]



Multiplying Ḡ_k^T and Y in (7), we have [10]

\[ \bar{G}_k^T Y = \bar{G}_k^T U G_k = [\bar{G}_k^T Z_1 \bar{G}_k + \bar{G}_k^T Z_2 \bar{G}_k \Delta] = P + Q\Delta. \tag{8} \]



Since Z_1 and Z_2 are symmetric, P and Q are symmetric
as well. The (i, j)th element of P + QΔ, 1 ≤ i, j ≤ k and
i ≠ j, is

\[ p_{ij} + q_{ij}\, a^{(j-1)\alpha}, \tag{9} \]

and the (j, i)th element is given by

\[ p_{ji} + q_{ji}\, a^{(i-1)\alpha}. \tag{10} \]

Since a^{(j−1)α} ≠ a^{(i−1)α} for all i ≠ j, p_ij = p_ji, and q_ij =
q_ji, combining (9) and (10), the values of p_ij and q_ij can be
obtained. Note that we only obtain k − 1 values for each row
of P and Q since no elements in the diagonal of P or Q are
obtained.
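
Combining the two symmetric entries amounts to solving a 2 × 2 linear system over GF(2^m). The sketch below shows this for one off-diagonal pair, assuming GF(2^3) with primitive polynomial x^3 + x + 1 and α = 2; the variable names and the sample values of p and q are illustrative.

```python
def make_gf(m, prim_poly):
    """Exp/log tables for GF(2^m); prim_poly is the primitive polynomial as a bit mask."""
    order = (1 << m) - 1
    exp, log = [0] * (2 * order), {}
    x = 1
    for i in range(order):
        exp[i], log[x] = x, i
        x <<= 1
        if x >> m:
            x ^= prim_poly
    for i in range(order, 2 * order):
        exp[i] = exp[i - order]
    return exp, log, order


EXP, LOG, ORDER = make_gf(3, 0b1011)  # GF(8), assumed field for this toy example


def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[LOG[x] + LOG[y]]


def inv(x):
    """Multiplicative inverse of a nonzero field element."""
    return EXP[(ORDER - LOG[x]) % ORDER]


def solve_pair(s_ij, s_ji, lam_i, lam_j):
    """Solve s_ij = p + q*lam_j and s_ji = p + q*lam_i for (p, q); needs lam_i != lam_j."""
    q = mul(s_ij ^ s_ji, inv(lam_i ^ lam_j))  # the p terms cancel under XOR
    p = s_ij ^ mul(q, lam_j)
    return p, q


alpha = 2
lam_i, lam_j = EXP[(alpha * 0) % ORDER], EXP[(alpha * 3) % ORDER]  # two diagonal entries of Delta
p, q = 5, 3                                                         # hypothetical entries of P and Q
s_ij = p ^ mul(q, lam_j)  # (i, j)th element of P + Q*Delta
s_ji = p ^ mul(q, lam_i)  # (j, i)th element of P + Q*Delta
print(solve_pair(s_ij, s_ji, lam_i, lam_j))  # (5, 3)
```

The division by lam_i + lam_j is what requires the diagonal elements of Δ to be distinct.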

To decode P, recall that P = Ḡ_k^T Z_1 Ḡ_k. P can be treated as
a portion of the codeword vector, Ḡ_k^T Z_1 Ḡ. By the construction
of G, it is easy to see that Ḡ is a generator matrix of the
[n, k − 1] RS code. Hence, each row in the matrix Ḡ_k^T Z_1 Ḡ is
a codeword. Since we know k − 1 components in each
row of P, it is possible to decode Ḡ_k^T Z_1 Ḡ by the error-and-erasure
decoder of the [n, k − 1] RS code.²

Since one cannot locate any erroneous position from the
decoded rows of P, the decoded α codewords are accepted
as Ḡ_k^T Z_1 Ḡ. By collecting the last α columns of Ḡ as Ḡ_α to
find its inverse (here it is an identity matrix), one can recover
Ḡ_k^T Z_1 from Ḡ_k^T Z_1 Ḡ. Note that α = k − 1. Since any α rows
in Ḡ_k^T are independent and thus invertible, we can pick any α
of them to recover Z_1. Z_2 can be obtained similarly by Q.

² The error-and-erasure decoder of an [n, k − 1] RS code can successfully
decode a received vector if s + 2v < n − k + 2, where s is the number of erasure (no
symbol) positions, v is the number of errors in the received portion of the
received vector, and n − k + 2 is the minimum Hamming distance of the
[n, k − 1] RS code.



Multiple Errors: Before presenting the proposed decoding
algorithm, we first prove that a decoding procedure can always
successfully decode Z_1 and Z_2 if v ≤ ⌊(n − k + 1)/2⌋ and all storage
nodes are accessed. Due to space limitation, all proofs are
omitted in this section.

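Assume the storage nodes with errors correspond to the ℓ_0th,
ℓ_1th, ..., ℓ_{v−1}th columns in the received matrix Y_{α×n}. Then,

\[
\begin{aligned}
\bar{G}^T Y_{\alpha \times n} &= \bar{G}^T U G + \bar{G}^T E \\
&= \bar{G}^T [Z_1\ Z_2] \begin{bmatrix} \bar{G} \\ \bar{G}\Delta \end{bmatrix} + \bar{G}^T E \\
&= [\bar{G}^T Z_1 \bar{G} + \bar{G}^T Z_2 \bar{G}\Delta] + \bar{G}^T E,
\end{aligned}
\tag{11}
\]

where

\[ E = \left[\, 0_{\alpha \times (\ell_0 - 1)} \mid e_{\ell_0}^T \mid 0_{\alpha \times (\ell_1 - \ell_0 - 1)} \mid \cdots \mid e_{\ell_{v-1}}^T \mid 0_{\alpha \times (n - \ell_{v-1})} \,\right]. \]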

Lemma 1: There are at least n − k + 2 errors in each of the
ℓ_0th, ℓ_1th, ..., ℓ_{v−1}th columns of Ḡ^T Y_{α×n}.

We next have the main theorem to perform data reconstruction.

Theorem 2: Let Ḡ^T Y_{α×n} = P̃ + Q̃Δ. Furthermore, let P̂
be the corresponding portion of decoded codeword vector to
P̃ and E_P = P̂ ⊕ P̃ be the error pattern vector. Assume that
the data collector accesses all storage nodes and there are v,
1 ≤ v ≤ ⌊(n − k + 1)/2⌋, of them with errors. Then, there are at
least n − k + 2 − v nonzero elements in the ℓ_jth column of E_P,
0 ≤ j ≤ v − 1, and at most v nonzero elements in the rest of
columns of E_P.

The above theorem allows us to design a decoding algorithm
that can correct up to ⌊(n − k + 1)/2⌋ errors.³ In particular, we need
to examine the erroneous positions in Ḡ^T E. Since 1 ≤ v ≤
⌊(n − k + 1)/2⌋, we have n − k + 2 − v ≥ ⌊(n − k + 1)/2⌋ + 1 > v. Thus,
the way to locate all erroneous columns in P̃ is to find
all columns in E_P where the number of nonzero elements
is greater than or equal to ⌊(n − k + 1)/2⌋ + 1. After we locate
all erroneous columns we can follow a procedure similar to
that given in the no error case to recover Z_1 from P̂.
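
The column test described above is a simple count-and-threshold step. The sketch below assumes the error pattern E_P is available as a list of rows; the function name and the toy pattern (n = 6, k = 3, one erroneous node) are illustrative.

```python
def erroneous_columns(E_P, n, k):
    """Indices of columns of E_P with at least floor((n - k + 1) / 2) + 1 nonzero entries."""
    threshold = (n - k + 1) // 2 + 1
    num_cols = len(E_P[0])
    counts = [sum(1 for row in E_P if row[c] != 0) for c in range(num_cols)]
    return [c for c, count in enumerate(counts) if count >= threshold]


# Toy error pattern: column 2 belongs to the erroneous node and carries four
# nonzero entries, while every other column has at most v = 1 nonzero entry.
n, k = 6, 3
E_P = [[0] * n for _ in range(n)]
for r in (0, 1, 3, 4):
    E_P[r][2] = 1
E_P[2][0] = 1
E_P[2][5] = 1
print(erroneous_columns(E_P, n, k))  # [2]
```

Here the threshold is ⌊(6 − 3 + 1)/2⌋ + 1 = 3, which separates the column with four nonzero entries from the columns with at most one.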

The above decoding procedure is guaranteed to recover Z_1
when all n storage nodes are accessed. However, it is not
very efficient in terms of bandwidth usage. Next, we present 
a progressive decoding version of the proposed algorithm that 
only accesses enough extra nodes when necessary. Before 
presenting it, we need the following corollary. 

Corollary 2: Consider that one accesses k + 2v storage
nodes, among which v nodes are erroneous and 1 ≤ v ≤
⌊(n − k + 1)/2⌋. There are at least v + 2 nonzero elements in the ℓ_jth
column of E_P, 0 ≤ j ≤ v − 1, and at most v among the
remaining columns of E_P.

Based on Corollary 2, we can design a progressive decoding
algorithm [14] that retrieves extra data from remaining storage
nodes when necessary. To handle Byzantine fault tolerance,
it is necessary to perform an integrity check after the original
data is reconstructed. Two verification mechanisms have been
suggested in [11]: cyclic redundancy check (CRC) and cryptographic
hash function. Both mechanisms introduce redundancy
to the original data before they are encoded and are suitable
to be used in combination with the decoding algorithm.

³ In constructing P̃ we only get n − 1 values (excluding the diagonal).
Since the minimum Hamming distance of an [n, k − 1] RS code is n − k + 2,
the error-and-erasure decoding can only correct up to ⌊(n − k + 1)/2⌋ errors.

Fig. 1. Failure-rate comparison between the previous algorithm in [11] and
the proposed algorithm for [20, 10, 18] MSR code

The progressive decoding algorithm starts from accessing 
k storage nodes. Error-and-erasure decoding succeeds only 
when there is no error. If the integrity check passes, then 
the data collector recovers the original data. If the decoding 
procedure fails or the integrity check fails, then the data 
collector retrieves two more blocks of data from the remaining 
storage nodes. Since the data collector now has k + 2 blocks of
data, the error-and-erasure decoding can correctly recover the
original data if there is only one erroneous storage node among
the k + 2 nodes accessed. If the integrity check passes, then
the data collector recovers the original data. If the decoding 
procedure fails or the integrity check fails, then the data 
collector retrieves two more blocks of data from the remaining 
storage nodes. The data collector repeats the same procedure 
until it recovers the original data or runs out of storage
nodes. The detailed decoding procedure is summarized in
Algorithm 1.
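
The control flow of this progressive procedure can be sketched independently of the RS decoder. In the Python sketch below, fetch_node, try_decode and integrity_ok are hypothetical callbacks standing in for node access, error-and-erasure decoding and the CRC or hash check. Nodes are accessed in index order for simplicity, whereas the data collector in the text chooses them at random.

```python
def progressive_reconstruct(n, k, fetch_node, try_decode, integrity_ok):
    """Access k nodes first, then two more per round until the data verifies.

    fetch_node(i) returns the column stored at node i, try_decode(columns, v)
    returns candidate data or None, and integrity_ok(data) verifies it.
    Returns (data or None, number of nodes accessed).
    """
    max_errors = (n - k + 1) // 2
    columns = {i: fetch_node(i) for i in range(k)}
    v = 0
    while v <= max_errors:
        data = try_decode(columns, v)
        if data is not None and integrity_ok(data):
            return data, len(columns)
        needed = len(columns) + 2
        if needed > n:
            break  # ran out of storage nodes
        for i in range(len(columns), needed):
            columns[i] = fetch_node(i)
        v += 1
    return None, len(columns)


# Stub environment (illustrative only): node 1 returns erroneous data, and the
# stub decoder succeeds once the assumed error count v covers the bad nodes.
bad_nodes = {1}


def fetch_node(i):
    return "bad" if i in bad_nodes else "ok"


def try_decode(columns, v):
    errors = sum(1 for c in columns.values() if c == "bad")
    return "payload" if errors <= v else "garbage"


def integrity_ok(data):
    return data == "payload"


result = progressive_reconstruct(20, 10, fetch_node, try_decode, integrity_ok)
print(result)  # ('payload', 12)
```

With a single erroneous node among the first k = 10, the first round fails the integrity check and the second round succeeds after k + 2 = 12 nodes have been accessed.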

Algorithm 1: Decoding of MSR Codes Based on (n, k − 1) RS Code for Data Reconstruction

begin
  v = 0; j = k;
  The data collector randomly chooses k storage nodes and retrieves encoded data, Y_{α×j};
  while v ≤ ⌊(n − k + 1)/2⌋ do
    Collect the j columns of Ḡ corresponding to accessed storage nodes as Ḡ_j;
    Calculate Ḡ_j^T Y_{α×j};
    Construct P̃ and Q̃ by using (9) and (10);
    Perform progressive error-and-erasure decoding on each row in P̃ to obtain P̂;
    Locate erroneous columns in P̂ by searching for columns with at least v + 2 errors;
      assume that ℓ_e columns are found in the previous action;
    Locate columns in P̂ with at most v errors;
      assume that ℓ_c columns are found in the previous action;
    if (ℓ_e = v and ℓ_c = k + v) then
      Copy the ℓ_e erroneous columns of P̂ to their corresponding rows to make P̂ a symmetric matrix;
      Collect any α columns in the above ℓ_c columns of P̂ as P̂_α and find its corresponding Ḡ_α;
      Multiply the inverse of Ḡ_α to P̂_α to recover Ḡ_j^T Z_1;
      Recover Z_1 by the inverse of any α rows of Ḡ_j^T;
      Recover Z_2 from Q̃ by the same procedure;
      Recover m̃ from Z_1 and Z_2;
      if integrity-check(m̃) = SUCCESS then
        return m̃;
    j ← j + 2;
    Retrieve 2 more encoded data from remaining storage nodes and merge them into Y_{α×j};
    v ← v + 1;
  return FAIL;

The proposed data reconstruction algorithm for MSR codes
is evaluated by Monte Carlo simulations. It is compared
with the previous data reconstruction algorithm in [11].
Each data point is generated from 10^3 simulation results.
Storage nodes may fail arbitrarily with the Byzantine failure
probability ranging from 0 to 0.5. [n, k, d] and m are chosen
to be [20, 10, 18] and 5, respectively. Figure 1 shows that the
proposed algorithm can successfully reconstruct the data with
much higher probability than the one presented in [11] at
the same node failure probability. For example, at the node
failure probability of 0.1, data for about 1 percent of node
failure patterns cannot be reconstructed using the proposed
algorithm. On the other hand, data for over 50 percent of node
failure patterns cannot be reconstructed using the previous
algorithm in [11]. The advantage of the proposed algorithm is
also overwhelming in the average number of accessed nodes
for data reconstruction. Due to space limitation, the simulation
results are omitted.

V. Conclusion 

In this work we proposed a new encoding scheme for the
[n, 2α] error-correcting MSR codes from the generator matrix
of any [n, α] RS code. It generalizes the previously proposed
MSR codes in [11] and has several salient advantages. It
allows the construction of least-update-complexity codes with
a properly chosen systematic generator matrix. More importantly,
the encoding scheme leads to an efficient decoding
scheme that can tolerate more errors at the storage nodes
and access additional storage nodes only when necessary. A
progressive decoding scheme was thereby devised with low
communication overhead.

Possible future work includes extension of the encoding 
and decoding schemes to MBR points, and the study of 
encoding schemes with optimal update complexity and good 
regenerating capability. 